
    Accurator: Nichesourcing for Cultural Heritage

    With more and more cultural heritage data being published online, their usefulness in this open context depends on the quality and diversity of descriptive metadata for collection objects. In many cases, existing metadata is not adequate for a variety of retrieval and research tasks, and more specific annotations are necessary. However, eliciting such annotations is a challenge since it often requires domain-specific knowledge. While crowdsourcing can be used successfully to elicit simple annotations, identifying people with the required expertise can prove difficult for tasks requiring more complex or domain-specific knowledge. Nichesourcing addresses this problem by tapping into the expert knowledge available in niche communities. This paper presents Accurator, a methodology for conducting nichesourcing campaigns for cultural heritage institutions by addressing communities, organizing events and tailoring a web-based annotation tool to a domain of choice. The contribution of this paper is threefold: 1) a nichesourcing methodology, 2) an annotation tool for experts and 3) validation of the methodology and tool in three case studies. The three domains of the case studies are birds on art, bible prints and fashion images. We compare the quality and quantity of obtained annotations in the three case studies, showing that the nichesourcing methodology in combination with the image annotation tool can be used to collect high-quality annotations in a variety of domains and annotation tasks. A user evaluation indicates that the tool is suited to and usable for domain-specific annotation tasks.

    Multimedia Annotations on the Semantic Web

    Multimedia in all forms (images, video, graphics, music, speech) is exploding on the Web. The content needs to be annotated and indexed to enable effective search and retrieval. However, recent standards and best practices for multimedia metadata don't provide semantically rich descriptions of multimedia content. On the other hand, the World Wide Web Consortium's (W3C's) Semantic Web effort has been making great progress in advancing techniques for annotating semantics of Web resources. To bridge this gap, a new W3C task force has been created to investigate multimedia annotations on the Semantic Web. This article examines the problems of semantically annotating multimedia and describes the integration of multimedia metadata with the Semantic Web. (Editor's note by John R. Smith)

    Thesaurus-based search in large heterogeneous collections

    In cultural heritage, large virtual collections are coming into existence. Such collections contain heterogeneous sets of metadata and vocabulary concepts, originating from multiple sources. In the context of the E-Culture demonstrator we have shown earlier that such virtual collections can be effectively explored with keyword search and semantic clustering. In this paper we describe the design rationale of ClioPatria, an open-source system which provides APIs for scalable semantic graph search. The use of ClioPatria’s search strategies is illustrated with a realistic use case: searching for “Picasso”. We discuss details of scalable graph search and the required OWL reasoning functionalities, and show why SPARQL queries are insufficient for solving the search problem.

    On the Role of User-generated Metadata in Audio Visual Collections

    Recently, various crowdsourcing initiatives showed that targeted efforts of user communities result in massive amo

    Searching in semantically rich linked data: a case study in cultural heritage

    Traditionally the relations between concepts from a controlled vocabulary, such as the hierarchical and associative relations in a thesaurus, have been used to support users in their search process. In the context of the Semantic Web, multiple interlinked vocabularies are becoming available, providing a large number of different relations between concepts. However, for a specific search task, only a small fraction of these will be meaningful to the user, and currently we have little understanding of which methods can be used to determine this. In this paper, we describe a case study in the cultural heritage domain that investigates support for the specific task of finding artworks in a data set of multiple linked art collections and vocabularies. In a first experiment a number of use cases from domain experts ar

    Trusting Semi-structured Web Data

    The growth of the Web brings an uncountable amount of useful information to everybody who can access it. These data are often crowdsourced or provided by heterogeneous or unknown sources, so they might be maliciously manipulated or unreliable. Moreover, because of their volume it is often impossible to check them extensively, which gives rise to massive and ever-growing trust issues. The research presented in this paper aims at investigating the use of data sources and reasoning techniques to address trust issues about Web data. In particular, these investigations include the use of trusted Web sources, of uncertainty reasoning, of semantic similarity measures and of provenance information as possible bases for trust estimation. The intended result of this thesis is a series of analyses and tools that help to better understand and address the problem of trusting semi-structured Web data.

    LCSH, SKOS and Linked Data

    A technique for converting Library of Congress Subject Headings MARCXML to Simple Knowledge Organization System (SKOS) RDF is described. Strengths of the SKOS vocabulary are highlighted, as well as possible points for extension, and the integration of other semantic web vocabularies such as Dublin Core. An application for making the vocabulary available as linked data on the Web is also described. Comment: Submission for the Dublin Core 2008 conference in Berli